C++ 的 sizeof 是怎么实现的？

您所在的位置：网站首页 › 数组 sizeof › C++ 的 sizeof 是怎么实现的？

C++ 的 sizeof 是怎么实现的？

2023-03-17 08:28| 来源: 网络整理| 查看: 265

sizeof的东西会被编译器直接替换掉，即使是汇编代码都只能看到一个常量，所以下面有童鞋说看反汇编源码是不行的，因为已经在编译器内部替换掉了（更严谨的说法是，VLA是特殊情况，这是后面的代码说明中有提到）。下面以Clang对sizeof的处理来看sizeof的实现。

在Clang的实现中，在lib/AST/ExprConstant.cpp中有这样的方法：

bool IntExprEvaluator::VisitUnaryExprOrTypeTraitExpr

这个方法的实现如此：

switch(E->getKind()) { case UETT_AlignOf: { if (E->isArgumentType()) return Success(GetAlignOfType(Info, E->getArgumentType()), E); else return Success(GetAlignOfExpr(Info, E->getArgumentExpr()), E); } case UETT_VecStep: { QualType Ty = E->getTypeOfArgument(); if (Ty->isVectorType()) { unsigned n = Ty->castAs()->getNumElements(); // The vec_step built-in functions that take a 3-component // vector return 4. (OpenCL 1.1 spec 6.11.12) if (n == 3) n = 4; return Success(n, E); } else return Success(1, E); } case UETT_SizeOf: { QualType SrcTy = E->getTypeOfArgument(); // C++ [expr.sizeof]p2: "When applied to a reference or a reference type, // the result is the size of the referenced type." if (const ReferenceType *Ref = SrcTy->getAs()) SrcTy = Ref->getPointeeType(); CharUnits Sizeof; if (!HandleSizeof(Info, E->getExprLoc(), SrcTy, Sizeof)) return false; return Success(Sizeof, E); } } llvm_unreachable("unknown expr/type trait"); }

然后通过这个方法，我们可以顺藤摸瓜，发现sizeof的处理其实是在HandleSizeof这个方法内，结果是会存储在Sizeof这个CharUnits中，而一个CharUnits是Clang内部的一个表示，引用Clang的注释如下

/// CharUnits - This is an opaque type for sizes expressed in character units. /// Instances of this type represent a quantity as a multiple of the size /// of the standard C type, char, on the target architecture. As an opaque /// type, CharUnits protects you from accidentally combining operations on /// quantities in bit units and character units. /// /// In both C and C++, an object of type 'char', 'signed char', or 'unsigned /// char' occupies exactly one byte, so 'character unit' and 'byte' refer to /// the same quantity of storage. However, we use the term 'character unit' /// rather than 'byte' to avoid an implication that a character unit is /// exactly 8 bits. /// /// For portability, never assume that a target character is 8 bits wide. Use /// CharUnit values wherever you calculate sizes, offsets, or alignments /// in character units.

然后，我们找寻HandleSizeof方法：

/// Get the size of the given type in char units. static bool HandleSizeof(EvalInfo &Info, SourceLocation Loc, QualType Type, CharUnits &Size) { // sizeof(void), __alignof__(void), sizeof(function) = 1 as a gcc // extension. if (Type->isVoidType() || Type->isFunctionType()) { Size = CharUnits::One(); return true; } if (!Type->isConstantSizeType()) { // sizeof(vla) is not a constantexpr: C99 6.5.3.4p2. // FIXME: Better diagnostic. Info.Diag(Loc); return false; } Size = Info.Ctx.getTypeSizeInChars(Type); return true; }

走到这里，我们就知道了为什么会被替换掉了，如你这里是void或者Function type，编译器都直接替换为CharUnits::One()这个常量（即一个Char的大小），所以这就是汇编也只能看到常量的原因，毕竟汇编是后面CodeGen的事情，而这里是在CodeGen之前发生的了。而在这里也会判断Type是不是ConstantSizeType，因为需要在编译期计算出来，而注释则是针对VLA，有兴趣的同学可以按照注释的C99地方去看说的是什么。接下来则是把Type传给getTypeSizeInChars方法了。

OK，接下来我们再一步一步的走下去，看getTypeSizeInChars做了什么。

/// getTypeSizeInChars - Return the size of the specified type, in characters. /// This method does not work on incomplete types. CharUnits ASTContext::getTypeSizeInChars(QualType T) const { return getTypeInfoInChars(T).first; }

走到这里的时候，虽然我们就算不走下去都能知道这个方法是返回特定类型的大小了，但是我们还是要打破沙锅问到底，看到底是怎么实现的。于是我们继续走getTypeInfoChars()这个方法。

std::pair ASTContext::getTypeInfoInChars(QualType T) const { return getTypeInfoInChars(T.getTypePtr()); }

走到这里，我们也知道为什么会有first了，因为这个方法返回的是一个std::pair，接下来我们可以发现调用的还是getTypeInChar方法，但是参数一个TypePointers，于是我们找这个重载方法：

std::pair ASTContext::getTypeInfoInChars(const Type *T) const { if (const ConstantArrayType *CAT = dyn_cast(T)) return getConstantArrayInfoInChars(*this, CAT); TypeInfo Info = getTypeInfo(T); return std::make_pair(toCharUnitsFromBits(Info.Width), toCharUnitsFromBits(Info.Align)); }

随后，我们可以发现是getTypeInfo这个方法，然后我们找到对应的代码：

TypeInfo ASTContext::getTypeInfo(const Type *T) const { TypeInfoMap::iterator I = MemoizedTypeInfo.find(T); if (I != MemoizedTypeInfo.end()) return I->second; // This call can invalidate MemoizedTypeInfo[T], so we need a second lookup. TypeInfo TI = getTypeInfoImpl(T); MemoizedTypeInfo[T] = TI; return TI; }

然后我们找到了这个，对于MemorizedTypeInfo我们暂时不需要关心，我们也能发现需要的东西其实在getTypeInfoImpl里面

/// getTypeInfoImpl - Return the size of the specified type, in bits. This /// method does not work on incomplete types. /// /// FIXME: Pointers into different addr spaces could have different sizes and /// alignment requirements: getPointerInfo should take an AddrSpace, this /// should take a QualType, &c. TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { uint64_t Width = 0; unsigned Align = 8; bool AlignIsRequired = false; switch (T->getTypeClass()) { #define TYPE(Class, Base) #define ABSTRACT_TYPE(Class, Base) #define NON_CANONICAL_TYPE(Class, Base) #define DEPENDENT_TYPE(Class, Base) case Type::Class: #define NON_CANONICAL_UNLESS_DEPENDENT_TYPE(Class, Base) \ case Type::Class: \ assert(!T->isDependentType() && "should not see dependent types here"); \ return getTypeInfo(cast(T)->desugar().getTypePtr()); #include "clang/AST/TypeNodes.def" llvm_unreachable("Should not see dependent types"); case Type::FunctionNoProto: case Type::FunctionProto: // GCC extension: alignof(function) = 32 bits Width = 0; Align = 32; break; case Type::IncompleteArray: case Type::VariableArray: Width = 0; Align = getTypeAlign(cast(T)->getElementType()); break; case Type::ConstantArray: { const ConstantArrayType *CAT = cast(T); TypeInfo EltInfo = getTypeInfo(CAT->getElementType()); uint64_t Size = CAT->getSize().getZExtValue(); assert((Size == 0 || EltInfo.Width getElementType()); Width = EltInfo.Width * VT->getNumElements(); Align = Width; // If the alignment is not a power of 2, round up to the next power of 2. // This happens for non-power-of-2 length vectors. if (Align & (Align-1)) { Align = llvm::NextPowerOf2(Align); Width = llvm::RoundUpToAlignment(Width, Align); } // Adjust the alignment based on the target max. uint64_t TargetVectorAlign = Target->getMaxVectorAlign(); if (TargetVectorAlign && TargetVectorAlign getKind()) { default: llvm_unreachable("Unknown builtin type!"); case BuiltinType::Void: // GCC extension: alignof(void) = 8 bits. Width = 0; Align = 8; break; case BuiltinType::Bool: Width = Target->getBoolWidth(); Align = Target->getBoolAlign(); break; case BuiltinType::Char_S: case BuiltinType::Char_U: case BuiltinType::UChar: case BuiltinType::SChar: Width = Target->getCharWidth(); Align = Target->getCharAlign(); break; case BuiltinType::WChar_S: case BuiltinType::WChar_U: Width = Target->getWCharWidth(); Align = Target->getWCharAlign(); break; case BuiltinType::Char16: Width = Target->getChar16Width(); Align = Target->getChar16Align(); break; case BuiltinType::Char32: Width = Target->getChar32Width(); Align = Target->getChar32Align(); break; case BuiltinType::UShort: case BuiltinType::Short: Width = Target->getShortWidth(); Align = Target->getShortAlign(); break; case BuiltinType::UInt: case BuiltinType::Int: Width = Target->getIntWidth(); Align = Target->getIntAlign(); break; case BuiltinType::ULong: case BuiltinType::Long: Width = Target->getLongWidth(); Align = Target->getLongAlign(); break; case BuiltinType::ULongLong: case BuiltinType::LongLong: Width = Target->getLongLongWidth(); Align = Target->getLongLongAlign(); break; case BuiltinType::Int128: case BuiltinType::UInt128: Width = 128; Align = 128; // int128_t is 128-bit aligned on all targets. break; case BuiltinType::Half: Width = Target->getHalfWidth(); Align = Target->getHalfAlign(); break; case BuiltinType::Float: Width = Target->getFloatWidth(); Align = Target->getFloatAlign(); break; case BuiltinType::Double: Width = Target->getDoubleWidth(); Align = Target->getDoubleAlign(); break; case BuiltinType::LongDouble: Width = Target->getLongDoubleWidth(); Align = Target->getLongDoubleAlign(); break; case BuiltinType::NullPtr: Width = Target->getPointerWidth(0); // C++ 3.9.1p11: sizeof(nullptr_t) Align = Target->getPointerAlign(0); // == sizeof(void*) break; case BuiltinType::ObjCId: case BuiltinType::ObjCClass: case BuiltinType::ObjCSel: Width = Target->getPointerWidth(0); Align = Target->getPointerAlign(0); break; case BuiltinType::OCLSampler: // Samplers are modeled as integers. Width = Target->getIntWidth(); Align = Target->getIntAlign(); break; case BuiltinType::OCLEvent: case BuiltinType::OCLImage1d: case BuiltinType::OCLImage1dArray: case BuiltinType::OCLImage1dBuffer: case BuiltinType::OCLImage2d: case BuiltinType::OCLImage2dArray: case BuiltinType::OCLImage3d: // Currently these types are pointers to opaque types. Width = Target->getPointerWidth(0); Align = Target->getPointerAlign(0); break; } break; case Type::ObjCObjectPointer: Width = Target->getPointerWidth(0); Align = Target->getPointerAlign(0); break; case Type::BlockPointer: { unsigned AS = getTargetAddressSpace( cast(T)->getPointeeType()); Width = Target->getPointerWidth(AS); Align = Target->getPointerAlign(AS); break; } case Type::LValueReference: case Type::RValueReference: { // alignof and sizeof should never enter this code path here, so we go // the pointer route. unsigned AS = getTargetAddressSpace( cast(T)->getPointeeType()); Width = Target->getPointerWidth(AS); Align = Target->getPointerAlign(AS); break; } case Type::Pointer: { unsigned AS = getTargetAddressSpace(cast(T)->getPointeeType()); Width = Target->getPointerWidth(AS); Align = Target->getPointerAlign(AS); break; } case Type::MemberPointer: { const MemberPointerType *MPT = cast(T); std::tie(Width, Align) = ABI->getMemberPointerWidthAndAlign(MPT); break; } case Type::Complex: { // Complex types have the same alignment as their elements, but twice the // size. TypeInfo EltInfo = getTypeInfo(cast(T)->getElementType()); Width = EltInfo.Width * 2; Align = EltInfo.Align; break; } case Type::ObjCObject: return getTypeInfo(cast(T)->getBaseType().getTypePtr()); case Type::Adjusted: case Type::Decayed: return getTypeInfo(cast(T)->getAdjustedType().getTypePtr()); case Type::ObjCInterface: { const ObjCInterfaceType *ObjCI = cast(T); const ASTRecordLayout &Layout = getASTObjCInterfaceLayout(ObjCI->getDecl()); Width = toBits(Layout.getSize()); Align = toBits(Layout.getAlignment()); break; } case Type::Record: case Type::Enum: { const TagType *TT = cast(T); if (TT->getDecl()->isInvalidDecl()) { Width = 8; Align = 8; break; } if (const EnumType *ET = dyn_cast(TT)) { const EnumDecl *ED = ET->getDecl(); TypeInfo Info = getTypeInfo(ED->getIntegerType()->getUnqualifiedDesugaredType()); if (unsigned AttrAlign = ED->getMaxAlignment()) { Info.Align = AttrAlign; Info.AlignIsRequired = true; } return Info; } const RecordType *RT = cast(TT); const RecordDecl *RD = RT->getDecl(); const ASTRecordLayout &Layout = getASTRecordLayout(RD); Width = toBits(Layout.getSize()); Align = toBits(Layout.getAlignment()); AlignIsRequired = RD->hasAttr(); break; } case Type::SubstTemplateTypeParm: return getTypeInfo(cast(T)-> getReplacementType().getTypePtr()); case Type::Auto: { const AutoType *A = cast(T); assert(!A->getDeducedType().isNull() && "cannot request the size of an undeduced or dependent auto type"); return getTypeInfo(A->getDeducedType().getTypePtr()); } case Type::Paren: return getTypeInfo(cast(T)->getInnerType().getTypePtr()); case Type::Typedef: { const TypedefNameDecl *Typedef = cast(T)->getDecl(); TypeInfo Info = getTypeInfo(Typedef->getUnderlyingType().getTypePtr()); // If the typedef has an aligned attribute on it, it overrides any computed // alignment we have. This violates the GCC documentation (which says that // attribute(aligned) can only round up) but matches its implementation. if (unsigned AttrAlign = Typedef->getMaxAlignment()) { Align = AttrAlign; AlignIsRequired = true; } else { Align = Info.Align; AlignIsRequired = Info.AlignIsRequired; } Width = Info.Width; break; } case Type::Elaborated: return getTypeInfo(cast(T)->getNamedType().getTypePtr()); case Type::Attributed: return getTypeInfo( cast(T)->getEquivalentType().getTypePtr()); case Type::Atomic: { // Start with the base type information. TypeInfo Info = getTypeInfo(cast(T)->getValueType()); Width = Info.Width; Align = Info.Align; // If the size of the type doesn't exceed the platform's max // atomic promotion width, make the size and alignment more // favorable to atomic operations: if (Width != 0 && Width getMaxAtomicPromoteWidth()) { // Round the size up to a power of 2. if (!llvm::isPowerOf2_64(Width)) Width = llvm::NextPowerOf2(Width); // Set the alignment equal to the size. Align = static_cast(Width); } }

一切真相大白了，已不需要解释了 :-)

【本文地址】

C++ 的 sizeof 是怎么实现的？

C++ 的 sizeof 是怎么实现的？

今日新闻

推荐新闻